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Abstract 

We consider the solution of large linear systems of equations that arise when two-dimensional sin¬ 
gularly perturbed reaction-diffusion equations are discretized. Standard methods for these problems, 
such as central hnite differences, lead to system matrices that are positive definite. The direct solvers 
of choice for such systems are based on Cholesky factorisation. However, as observed in [^, these 
solvers may exhibit poor performance for singularly perturbed problems. We provide an analysis of 
the distribution of entries in the factors based on their magnitude that explains this phenomenon, and 
give bounds on the ranges of the perturbation and discretization parameters where poor performance 
is to be expected. 
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1 Introduction 


We consider the singularly perturbed two dimensional reaction-diffusion problem: 

- e^Au + b{x,y)u = f{x,y), fl = (0,1)^, u{dVL) = g{x,y), (1) 

where the “perturbation parameter”, e, is a small and positive, and the functions g, b and / are given, 
with 6(x, y)> > 0. 

We are interested in the numerical solution of m by the following standard finite difference technique. 
Denote the mesh points of an arbitrary rectangular mesh as {xi, yj) for f, j G {0,1,..., iV}, write the local 
mesh widths as G = Xi — Xi-i and kj = yj —yj-i, and let hi = {xi+i —Xi-i)/2, and kj = (jjj+i —?/j_i)/2. 
Then the linear system for the finite difference method can be written as 

AU^ = , where A = + hikjb{xi,yj)) , (2a) 

*This author’s work is supported by the Irish Research Council under Grant No. RS/2011/179. 
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and is the symmetrised 5-point second order central difference operator 

/ hj \ 


kj+i 



V kj / 

It is known that the scheme @ applied to m on a boundary layer-adapted mesh with N intervals in each 
direction yields a parameter robust approximation, see, e.g., [am. Since A in (I2a[l is a banded, symmetric 
and positive definite, the direct solvers of choice are variants on Cholesky factorisation. This is based on 
the idea that there exists a unique lower-triangular matrix L (the “Cholesky factor”) such that A = 

(see, e.g., [U Thm. 4.25]). Conventional wisdom is that the computational complexity of these methods 
depends exclusively on N and the structure of the matrix (i.e., its sparsity pattern). However, MacLachlan 
and Madden [6l §4.1] observe that standard implementations of Cholesky factorisation applied to ([2a|) 
perform poorly when e in © is small. Their explanation is that the Cholesky factor, L, contains many 
small entries that fall in to the range of subnormal floating-point numbers. These are numbers that 
have magnitude (in exact arithmetic) between ss 5 x and Ri 2 x 10“^°® (called 

realmin in MATLAB). Numbers greater than 2.2 x 10“®°® are represented faithfully in IEEE standard 
double precision, while numbers less than 2“^*^^"^ are flushed to zero (we’ll call such numbers “underflow- 
zeros”). Floats between these values do not have full precision, but allow for “gradual underflow”, 
which (ostensibly) leads to more reliable computing (see, e.g., [3 Chap. 7]). Unlike standard floating¬ 
point numbers, most CPUs do not support operations on subnormals directly, but rely on microcode 
implementations, which are far less efficient. Thus it is to be expected that it is more time-consuming to 
factorise A in (j^all when e is small. As an example of this, consider ([3) where N = 128 and the mesh 
is uniform. The nonzero entries of the associated Cholesky factor are located on the diagonals that are 
at most a distance N from main diagonal. In [Figure 1] we plot the absolute value of largest entry of a 
given diagonal of L, as a function of its distance from the main diagonal. On the left of [Figure© where 
e = 1, we observe that the magnitude of the largest entry gradually decays away from the location of the 
nonzero entries of A. In contrast, when e = 10“® (on the right), magnitude of the largest entry decays 
exponentially. 

To demonstrate the effect of this on computational efficiency, in lTableTl we show the time, in seconds, 
taken to compute the factorisation of A in (j^all with a uniform mesh, and N = 512, on a single core of 
AMD Opteron 2427, 2200 MHz processor, using CHOLMOD [I] with “natural order” (i.e., without a fill 
reducing ordering). Observe that the time-to-factorisation increases from 52 seconds when e is large, to 
nearly 500 seconds when e = 10“®, when over 1% of the entries are in the subnormal range. When e is 
smaller again, the number of nonzero entries in L is further reduced, and so the execution time decreases 
as well. 




Figure 1: Semi-log plot of maximal entries on diagonals of L with N = 128, and e = 1 (left) and e = 10 ® (right). 
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£ 

10-1 

10-2 

10-3 

10-1 

10-3 

10-6 

Time (s) 

52.587 

52.633 

496.887 

175.783 

74.547 

45.773 

Nonzeros in L 

133,433,341 

133,433,341 

128,986,606 

56,259,631 

33,346,351 

23,632,381 

Subnormals in L 

0 

0 

1,873,840 

2,399,040 

1,360,170 

948,600 

Underflow zeros 

0 

0 

4,446,735 

77,173,710 

100,086,990 

109,800,960 


Table 1: Time taken (in seconds) to compute the Cholesky factor, L, of A in @ on a uniform mesh with N = 512. 

The number of nonzeros, subnormals, and underflow-zeros in L are also shown. 

Our goal is to give an analysis that fully explains the observations of [FigureT] and [Table 1[ and that 
can also be exploited in other solver strategies. We derive expressions, in terms of N and e, for the 
magnitude of entries of L as determined by their location. Ultimately, we are interested in the analysis 
of systems that arise from the numerical solution of © on appropriate boundary layer-adapted meshes. 
Away from the boundary, such meshes are usually uniform. Therefore, we begin in Section 12.11 with 
studying a uniform mesh discretisation, in the setting of exact arithmetic, which provides mathematical 
justification for observations in [Figure fj In Section 12.21 this analysis is used to quantify to number of 
entries in the Cholesky factors of a given magnitude. As an application of this, we show how to determine 
the number of subnormal numbers that will occur in L in a floating-point setting, and also determine an 
lower bound for e for which the factors are free of subnormal numbers. Finally, the Cholesky factorisation 
on a boundary layer-adapted mesh is discussed in Section 12.dl and our conclusions are summarised in 
Section [H 


2 Cholesky factorisation on a uniform mesh 

2.1 The magnitude of the fill-in entries 

We consider the discretisation (I2bll of the model problem ([1]) on a uniform mesh with N intervals on 
each direction. The equally spaced stepsize is denoted hy h = N~^. When s h, which is typical in a 
singularly perturbed regime, the system matrix in (I2al) can be written as the following 5-point stencil 


/ 

-£2 \ 

^ —£2 \ 


-£2 

4£2 - 1 - h%{Xi,yj) -£2 = 

—£2 0{h?) —£2 1 , 

(3) 

V 

/ 

V / 



since -f h?b{xi,yj)) = 0{h‘^), where we write /(•) = 0{g{-)) if there exist positive constants Cq and 
Cl, independent of N and e, such that Co|(7(')| < /(•) < C'i|<?(’)l- 

Algorithm[T]presents a version of Cholesky factorisation which adapted from [H page 143]. It computes 
a lower triangular matrix L such that A = LL'^. We will follow MATLAB notation by denoting A = 
[a{i,j)] and L = [l{i,j)]. 

We set m = A^—l,soAisa sparse, banded m? x rri^ matrix, with a bandwidth of m, and has no more 
than five nonzero entries per row. Its factor, L, is far less sparse: although it has the same bandwidth 
as A, it has 0{m) nonzeros per row (see, e.g., [H Prop. 2.4]). The set of non-zero entries in L that are 
zero in the corresponding location in A is called the fill-in. We want to find a recursive way to express 
the magnitude of these fill-in entries, in terms of e and h. 

To analyse the magnitude of the fill-in entries, we borrow notation from [SI Sec. 10.3.3], and form 
distinct sets denoted ..., where all entries of L of the same magnitude (in a sense explained 
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Algorithm 1 Cholesky factorisation: 
for j = 1 : n 
if j = 1 


for i = j : n 

Khj) = 

end 

elseif {j > 1) 

for i = j : n 

KiJ) = 

end 


ajhj) 
\/a(j, j) 


ajhj) - ELi Khk)l{j,k) 


end 

end 


carefully below) belong to the same set. We denote by the magnitude of entries in i.e., 
if and only if l^i^j) is We shall see that these sets are quite distinct, meaning that /[*:+i] 

for k >1. is used to denote the set of nonzero entries in A, and entries of L that are zero (in exact 
arithmetic) are defined to belong to 


In Algorithm [U all the entries of L are initialised as zero, and so belong to L^°°\ Suppose that is 
such that G so, initially, each pij = oo. At each sweep through the algorithm, a new value 

of l{i,j) is computed, and so pij is modified. From line 8 in Algorithm [T] we can see that the pij is 
updated by 


P^,J = 


min{0,p,_i 1 + l,Pi2 +Pj,2 + 1,- ■ •,Piy_i + 1}, if a{i,j) ^ 0, 

niin{Pi,i +Pj,i + +Pj ,2 + 1, • ■ -^Pij-i +Pi, 3 -i + 1}> otherwise. 


Then, as we shall explain in detail below, it can be determined that L has a block structure shown 
in (|lall - (|^ . where, for brevity, the entries belonging to are denoted by [fc], and the entries that 
corresponding to nonzero entries of original matrix are written in terms of their magnitude: 



/m 


\ 



/ 0{h) 



\ 


P 

Q 




0{e^lh) 

0(h) 



L = 


P Q 


, where M = 


0{e^lh) 0{h) 




1 

P 

q) 



\ 


0{£^lh) 

Oih)) 


(0{e‘^/h) 


[1] 

[2] 

[3] 

[m — 2] 

[m - 1] ^ 




0{e^lh) 

[1] 

[2] 

[m — 3] 

[m — 2] 



0{e^/h) [1] 

0(e^lK) 

V 

/ 0(h) 

0{e^/h) 0{h) 

[3] 0{e‘^/h) 0{h) 

[4] [3] 0{e^/h) 0{h) 


[2] [3] 

[ 1 ] [ 2 ] 

0{e^/h) [1] 

0{e^lh)) 

\ 


(4b) 


(4c) 


[m — 1] [m — 2] 

\ H [m - 1] 


[3] 0{e‘^/h) Oih) 

[4] [3] 0{e'^/h) 0{h)) 
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We now explain why the entries of L, which are computed by column, have the structure shown in (|^. 
According to Algorithm [T] the first column of L is computed by 1) = a(i, l)/y^a(l, 1), which shows 
that there is no fill-in entry in this column. For the second column, the only fill-in entry is 

a{m + 1,2)-l{m + l,l)l{2,l) 0 - 0{e^lh)0{e^ jh) ,^,^4 

°— W) —>' 

where l{m + 1,1) and l{2, 1) belong to so l(m + 1,2) is in Similarly, there are two fill-ins in 
third column: /(m + 1, 3) and Urn + 2,3). The entry /(m + 1, 3) is computed as 

Km -H .tl = + b 3) - ELi Km + 1, k) ^ -/(to- fl, 2)1(3, 2) 

which is 0{e^/h^)- moreover, since /(to + 1, 2) € and /(3, 2) € so /(to -I- 1, 3) G Similarly, it 
is easy to see that lira + 2, 3) G We may now proceed by induction to show that /(to -|- 1, j -|- 1) = 

0 (£ 2 O-i-i)y'^( 2 j-i-i)) i 30 iongs to L^\ for 1 < j < to — 2. Suppose l(m + l,j) = G 

Then 


l{m + l,j + l) 


a{m + 1, jW 1) - Yfk=i 

y/aUl) 

-/(to + 1, j)^i + 1, j) 


since l{j + l,k) = 0 , < j — 1 , 




And, because l{j + l,j) G we can deduce that l{m -I- 1, j + 1) G The process is repeated from 
column 1 to column to, yielding the pattern for P shown in (I4bl) . 


A similar process is used to show that Q is as given in Its first fill-in entry is l{m + 3,m + 1). 

Note that a(m -I- 3, to -|- 1 ) = Urn + 1,1) = /(to + 1 , 2 ) = 0, that the magnitude of the entry in is 
0 (£ 2 (j-i-i)^^( 2 j-i-i))^ and that the sum of two entries of the different magnitude has the same magnitude 
as larger one. Then 


/(to + 3, to + 1) 


- + 3, k)l{m + 1, k) 


\J a{m + 1, TO + 1) 


0[ - ]Ol ^ 


+0 


2(m-2) 


/j(2(m-3)-H) 




o 


e 

2(m) 


/j(2(m-l)-Hl) 




o{h) 




1 


and so /(to + 3, to + 1) belongs to Proceeding inductively, as was done for P shows that Q has the 
form given in (I4cl) . Furthermore, the same process applies to each block of L in (I4al) . Summarizing, we 
have established the following result. 


Theorem 1. The fill-in entries of the Cholesky factor L of the matrix A defined in m is as given in 
W- Moreover, setting 6 = e/h, the magnitude /[^^ is 

/W = C)^e2(fc+i)//i(2fc+i)^ = /or fc=l,2,...,TO. (5) 


2.2 Distribution of fill-in entries in a floating-point setting 

In practice, Cholesky factorisation is computed in a floating-point setting. As discussed in Section [U 
the time taken to compute these factorisations increases greatly if there are many subnormal numbers 
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present. Moreover, even the underflow-zeros in the factors can be expensive to compute, since they 
typically arise from intermediate calculations involving subnormal numbers. Therefore, in this section we 
use the analysis of Section [2.11 to estimate, in terms of e and TV, the number of entries in L that are of a 
given magnitude. From this, one can easily predict the number of subnormals and underflow-zeros in L. 

Lemma 1. Let A he the vn? x w? matrix in where the mesh is uniform. Then the number of nonzero 
entries in the Cholesky factor L (i.e., A = LLF) computed using exact arithmetic is 

+ m — (6) 

Proof. Since A has bandwidth m, and so too does L ([3l Prop. 2.3]). By the Algorithm [U the fill-in 
entries only occur from row (m -|- 1). So, from row (m -I- 1), any row of L has (m -I- 1) nonzero entries 
and there are m(m — 1) such rows, plus 2m — 1 nonzero entries from top-left block M in (I4al) . Summing 
these values, we obtain (jHl). □ 

Let be the number of fill-in entries which belong to To estimate it is sufficient to 

evaluate the fill-in entries in the submatrices P and Q shown in (U)) . ITable 21 describes the number of 
hll-in entries associated with their magnitude. 



|lW| in P 

|lW| in Q 

|lW| in [P,Q] 

lW 

TO — 1 

0 

TO — 1 

L[21 

TO — 2 

0 

TO — 2 

L[31 

TO — 3 

TO — 2 

2to — 5 

lW 

m — k 

TO — fc -f 1 

2to — 2fc -|- 1 


2 

3 

5 


1 

2 

3 


0 

1 

1 


Table 2: Number of fill-in entries in P and Q associated with their magnitude. 


Note that there are (m — 1) blocks like [P,Q] in L. Then, since and the smallest (exact) 

nonzero entries belong to lI*”! we can use ITable 2l to determine the number of entries that are at most 

for some given p as: 

{ (m — 1)(2 to — 3) -I- (m — l)(m — 2)^ = (m — 1)^ p = 1, 

(m — l)(m — 2) + {m — l){m — 2)^ = {m — 2)(m — 1)^ p = 2, 

(m — l)(m — p-\-lY P > 3. 

These equations can be combined and summarised as follows. 

Theorem 2. Let A he the matrix of the form ©• Then, the number of fill-in entries associated with 
their magnitude of the matrix L satisfies 

m 

< (m-l)(m-p-bl)^, p > 1. (7) 

k—p 
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Combining Theorems [T] and [5] enables us to accurately predict the total number, and location, of 
subnormal and underflow-zero entries in L, for given N and e. For example, recall [Figure f] where we 
took e = 10“® and N = 128. To determine, using [Theorem ll the diagonals where entries are subnormal, 
we solve 

(£jV)2('=+i) = (8) 

for N = 128 and e = 10“®, which yields k fv 38. It clearly agrees with the observation in [Figure 1| i.e., 
the maximal value of the entries on diagonals 38 and iV — 38 = 90 are less than realmin. Similarly, all 
entries on diagonals between 40 and 88 are flushed to zero. 

As a further example, letting N = 512 and e = 10“®, by ©, the total number of underflow-zero and 
subnormal entries in L are, respectively, 

511 47 511 511 

|lW I = 109,800,960, and Y I = 1^'*’ I “ H I = 

fc—48 /c—46 k—46 k—4S 

This is exactly what is observed in Table [H Moreover, the total number of entries with magnitude 
less than realmin is 110,749,560 which is over 80% of the exact nonzero entries (cf. Lemma [T]) in L: 
133,433,341. Such a predictable appearance of subnormals and underflows is important in the sense of 
choosing suitable linear solvers, i.e., direct or iterative ones. 

More generally, we can use ([5]) to investigate ranges of N and e for which subnormal entries occur 
(assuming e < N~^). Since the largest possible value of fc is m, a Cholesky factor will have subnormal 
entries if e and N are such that Rearranging, this gives that 

e < 1 (2-1022^)1/(2^) ^ (g) 

The function g defined in ([9|) is informative because it gives the largest value of e for a discretisation 
with given N leads to a Cholesky factor with entries less than 2 - 1022 . For example, [Figure 2] (on the 
left) shows g{N) for N G [200,500]. It demonstrates that, for e < 1.05 x 10-^ (determined numerically), 
subnormal entries are to be expected for some values of N (cf. [Table 11) . The line e = 10-^ intersects 
g at approximately N = 263 and N = 484, meaning that a discretisation with 263 < N < 484 yields 
entries with the magnitude less than 2-1022 in L for e = IQ-o. On the right of [Figure~2] we show that, for 
large N, g{N) decays like A^-i. Since we are interested in the regime where £ < A^-i, this shows that, 
for small e, subnormals are to be expected for all but the smallest values of N. 




Figure 2: The function g{N) defined in (|9|) with N € [200,500] (left) and N € [1,5000] (right). 


2.3 Boundary layer-adapted meshes 

Our analysis so far has been for computations on uniform meshes. However, a scheme such as ([2l) for 
o is usually applied on a layer-adapted mesh, such as a Shishkin mesh. For these meshes, in the 
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neighbourhood of the boundaries, and especially near corner layers, the local mesh width is 0 {eN~^) in 
each direction, and so the entries of the system matrix are of the same order, and no issue with subnormal 
numbers is likely to arise. However, away from layers, these fitted meshes are usually uniform, with a 
local mesh width of and so the analysis outlined above applies directly. Since roughly one 

quarter (depending on mesh construction) of all mesh points are located in this region, the influence on 
the computation is likely to be substantial. 

The main complication in extending our analysis to, say, a Shishkin mesh, is in considering the “edge 
layers”, where the mesh width may be 0 {eN~^) in one coordinate direction, and 0 {N~^) in another. 
Although we have not analysed this situation carefully, in practise it seems that the factorisation behaves 
more like a uniform mesh. This is demonstrated in ITable 3l below. Comparing with ITable ll we see, for 
small e, the number of entries flushed to zero is roughly three-quarters that of the uniform mesh case. 


5 

10-1 

10-2 

10-3 

lO-'i 

10-3 

10-3 

Time (s) 

52.580 

58.213 

447.533 

179.540 

101.507 

73.250 

Nonzeros in L 

133,433,341 

133,240,632 

127,533,193 

78,091,189 

62,082,599 

54,497,790 

Subnormals in L 

0 

28,282 

2,648,308 

1,669,345 

1,079,992 

814,291 

Underflow zeros 

0 

192,709 

5,900,148 

55,342,152 

71,350,742 

78,935,551 


Table 3: Time taken (in seconds) to compute the Cholesky factor, L, of A in @ on a Shishkin mesh with N = 512. 
The number of nonzeros, snbnormals, and underflow-zeros in L are also shown. 


3 Conclusions 


The paper addresses, in a comprehensive way, issues raised in by showing how to predict the number 
and location of subnormal and underflow entries in the Cholesky factors of A in ((2a1l for given e and N. 

Further developments on this work are possible. In particular, the analysis shows that, away from the 
existing diagonals, the magnitude of fill-in entries decay exponentially, as seen in ©, a fact that could 
be exploited in the design of preconditioners of iterative solvers. For example, as shown in lLemma 11 the 
Cholesky factor of A, in exact arithmetic, has 0{N^) nonzero entries. However, Theorem [5] shows that, 
in practice (i.e., in a float-point setting), there are only 0{N‘^) entries in L when e is small and N is 
large. This suggests that, for a singularly perturbed problem, an incomplete Cholesky factorisation may 
be a very good approximation for L. This is a topic of ongoing work. 

In this paper we have restricted our study to Cholesky factorisation of the coefficient matrix arising 
from finite-difference discretisation of the model problem o on a uniform and a boundary layer-adapted 
mesh, the same phenomenon is also observed in more general context of singularly perturbed problems. 
That includes the LH-factorisations of the coefficient matrices coming from both finite difference and 
finite element methods applied to reaction-diffusion and convection-diffusion problems, though further 
investigation is required to establish the details. 


References 

[1] Y. Chen, T.A. Davis, W.W. Hager, and S. Rajamanickam, Algorithm 887: CHOLMOD, supernodal 
sparse cholesky factorization and update/downdate, ACM Trans. Math. Softw., 35 (2008), pp. 22:1- 
22:14. 


8 
















[2] C. Clavero, J.L. Gracia, and E. O’Riordan. A parameter robust numerical method for a two dimen¬ 
sional reaction-diffusion problem. Math. Comput., 74(252):1743-1758, 2005. 

[3] James W. Demmel. Applied numerical linear algebra. Society for Industrial and Applied Mathemat¬ 
ics (SIAM), Philadelphia, PA (1997). 

[4] Gene H. Golub and Charles F. Van Loan. Matrix computations. Johns Hopkins Studies in the 
Mathematical Sciences. Johns Hopkins University Press, Baltimore, MD, third edition (1996). 

[5] Torsten Linfi. Layer-adapted meshes for reaction-convection-diffusion problems, volume 1985 of Lec¬ 
ture Notes in Mathematics. Springer-Verlag, Berlin, 2010. 

[6] S. MacLachlan and N. Madden, Robust solution of singularly perturbed problems using multigrid 
methods, SIAM J. Sci. Comput., 35 (2013), pp. A2225-A2254. 

[7] Michael L. Overton. Numerical computing with IEEE floating point arithmetic. Society for Industrial 
and Applied Mathematics (SIAM), Philadelphia, PA, 2001. 

[8] Saad, Y. Iterative methods for sparse linear systems, Society for Industrial and Applied Mathematics, 
Philadelphia, PA, second edition (2003). 


9 



